A comparison of fusion techniques in mel-cepstral based speaker identification

Authors

  • Stefan Slomka
  • Sridha Sridharan
  • Vinod Chandran
Abstract

Input-level and output-level fusion methods are compared for fusing Mel-frequency cepstral coefficients (MFCCs) with their corresponding delta coefficients. A 49-speaker subset of the King database is used under wideband and telephone conditions. The best input-level fusion system is more computationally complex than the output-level fusion system. Both input-level and output-level fusion systems outperformed the best purely MFCC-based system on wideband data. On King telephone data, only the output-level fusion system outperformed the best purely MFCC-based system. Further experiments were performed on NIST’96 data under matched and mismatched conditions. Provided it was well tuned, the output-level fused system outperformed the input-level fused system under all experimental conditions.
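To make the distinction between the two fusion levels concrete, the following is a minimal illustrative sketch, not the authors' system. The abstract does not state which classifier or fusion rule was used, so the sketch assumes a single diagonal-covariance Gaussian per speaker and a weighted sum of average log-likelihoods for output-level fusion. All function names and the `weight` parameter are hypothetical.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based delta coefficients over +/- `width` frames.

    `features` is (n_frames, n_coeffs); edges are padded by repeating
    the first and last frame.
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    num = np.zeros_like(features, dtype=float)
    for k in range(1, width + 1):
        num += k * (padded[width + k : width + k + n_frames]
                    - padded[width - k : width - k + n_frames])
    return num / (2 * sum(k * k for k in range(1, width + 1)))

def fit_gaussian(frames):
    """Assumed stand-in speaker model: one diagonal Gaussian (mean, variance)."""
    return frames.mean(axis=0), frames.var(axis=0) + 1e-6

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of `frames` under a diagonal Gaussian."""
    mean, var = model
    ll = -0.5 * (np.log(2 * np.pi * var) + (frames - mean) ** 2 / var)
    return ll.sum(axis=1).mean()

def identify_input_fusion(static, dyn, fused_models):
    """Input-level fusion: concatenate static and delta features per frame,
    then score the joint vectors with one model per speaker."""
    fused = np.hstack([static, dyn])
    scores = {spk: avg_log_likelihood(fused, m) for spk, m in fused_models.items()}
    return max(scores, key=scores.get)

def identify_output_fusion(static, dyn, static_models, delta_models, weight=0.5):
    """Output-level fusion: score each stream with its own model, then
    combine the two scores (here an assumed weighted sum)."""
    scores = {
        spk: weight * avg_log_likelihood(static, static_models[spk])
             + (1.0 - weight) * avg_log_likelihood(dyn, delta_models[spk])
        for spk in static_models
    }
    return max(scores, key=scores.get)
```

In this simplified setting the joint model in `identify_input_fusion` has twice the feature dimension of either stream model, while `identify_output_fusion` exposes a single `weight` that can be tuned per condition. With diagonal Gaussians and `weight=0.5` the two approaches rank speakers identically, so any difference in this sketch comes only from tuning the weight; richer models would behave differently.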

Related resources

Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification

This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral fea...

A Framework for Multilingual Text-Independent speaker identification System

This article evaluates the performance of Extreme Learning Machine (ELM) and Gaussian Mixture Model (GMM) in the context of text independent Multi lingual speaker identification for recorded and synthesized speeches. The type and number of filters in the filter bank, number of samples in each frame of the speech signal and fusion of model scores play a vital role in speaker identification accur...

Fusion of Cross Stream Information in Speaker Verification

This paper addresses the performance of various statistical data fusion techniques for combining the complementary score information in speaker verification. The complementary verification scores are based on the static and delta cepstral features. Both LPCC (Linear prediction-based cepstral coefficients) and MFCC (mel-frequency cepstral coefficients) are considered in the study. The experiment...

Evaluation of a speaker identification system with and without fusion using three databases in the presence of noise and handset effects

In this study, a speaker identification system is considered consisting of a feature extraction stage which utilizes both power normalized cepstral coefficients (PNCCs) and Mel frequency cepstral coefficients (MFCC). Normalization is applied by employing cepstral mean and variance normalization (CMVN) and feature warping (FW), together with acoustic modeling using a Gaussian mixture model-unive...

Improving Speaker Identification Performance by Combining Vocal Tract Features

This paper proposes fusion and addition techniques of vocal tract features such as Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC) in speaker identification. Feature extraction plays an important role as a front end processing block in Speaker Identification (SI) process. Mel frequency features are used to extract the spectral characteristics o...

Publication date: 1998